1. Introduction

[...] querying on tabular data present in web pages, [...] enabling a developer to quickly create [...]

ActiveTable consists of three components:

1. The FastPattern component, which shows a user some quick [...]

2. The UserQuery component, which enables users to ask their own questions,
using multiple data sources.

3. The DataMining component, which enables users to ask data
mining questions, across multiple data sources.

In accordance with the subcomponents mentioned above, ActiveTable™
development proceeds in three stages:

1. Basic ActiveTable components, and all [...] required to
support the FastPattern mechanism
2. Support for the UserQuery component and multiple data sources
3. Support for the DataMining component

This document addresses the design for the first stage of ActiveTable
development, viz. Basic ActiveTable components and Pattern support.

Section 2 deals with the ActiveTable architecture, and discusses the component
classes in detail.

1.1. Document Conventions

Throughout this document, 
and properties of a class wi 



2. Components of ActiveTable™

ActiveTable will be implemented as a JavaBean that can be visually manipulated
and customised using a builder tool.

[...] further
methods for creating and customizing the ActiveTable. This section deals with all
the classes required for supporting this basic functionality.
[...] exposed to the developer.

(including indexing [...])
about the look and feel of the data on the user's screen.

In addition, for FastPattern [...]
information related to the analyses to be performed for the [...]

An ActiveTable must have a Primary [...]
delayed until [...] by the user. [...]
mechanism (using the Index class) is then created for this table.

An ActiveTable also contains a two-dimensional array of GridElements. These
represent the cells of the table displayed by the ActiveTable.

2.1.1.1 Constructors

2.1.1.1.1 Public ActiveTable ()

This constructor creates a new ActiveTable object with all properties set to
NULL.

2.1.1.1.2 Public ActiveTable (DataSourceTypeEnum newDataSourceType, String
newPrimaryDataSource)

This constructor creates a new ActiveTable object with the
primaryDataSourceType property set to newDataSourceType, and the
primaryDataSource property set to newPrimaryDataSource. It also does the
following:

1. Creates a new instance of a DataSource class by importing data from
newPrimaryDataSource.

2. Creates tabularData[] (an array of objects of class GridElement), using the
DataSource object created above, and default values for grid size, font
weight, font size and color.

3. Destroys the DataSource object created above.
2.1.1.1.3 Public ActiveTable (DataSourceTypeEnum newDataSourceType, String
newPrimaryDataSource, AttributeCluster newFastPatternAnalyses[])

This constructor creates a new ActiveTable object with the
primaryDataSourceType property set to newDataSourceType, the
primaryDataSource property set to newPrimaryDataSource, and the
fastPatternAnalyses[] property set to newFastPatternAnalyses[]. It also does
the following:

1. Creates a new instance of a DataSource class by importing data from
newPrimaryDataSource.

2. Creates tabularData[] (an array of objects of class GridElement), using the
DataSource object created above, and default values for grid size, font
weight, font size and color.

3. Creates tableIndex, an index of the values appearing in each of the
AttributeClusters in the fastPatternAnalyses array, and fastPatternTotals,
an array containing totals of the columns in the fastPatternAnalyses array
(this is concurrent with Step 2, to avoid multiple passes on the data).

4. Destroys the DataSource object created above.
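
The note in Step 3 about avoiding multiple passes can be illustrated with a minimal sketch. It is not the ActiveTable implementation: CellSketch, SinglePassBuildSketch, valueIndex and columnTotals are hypothetical names, the imported data is assumed to be a rectangular array of floats, and every column is indexed rather than only the columns named in an AttributeCluster.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified stand-in for GridElement (numeric data only). */
final class CellSketch {
    final int row;
    final int col;
    final float data;

    CellSketch(int row, int col, float data) {
        this.row = row;
        this.col = col;
        this.data = data;
    }
}

/** Builds the grid, a per-column value index and per-column totals in one pass. */
final class SinglePassBuildSketch {
    final CellSketch[][] tabularData;
    // Assumption: the index maps each distinct value in a column to its count.
    final List<Map<Float, Integer>> valueIndex = new ArrayList<>();
    final float[] columnTotals;

    SinglePassBuildSketch(float[][] importedRows) {
        int rows = importedRows.length;
        int cols = rows == 0 ? 0 : importedRows[0].length;
        tabularData = new CellSketch[rows][cols];
        columnTotals = new float[cols];
        for (int c = 0; c < cols; c++) {
            valueIndex.add(new HashMap<>());
        }
        // Single traversal: each cell is read once and feeds all three structures.
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                float value = importedRows[r][c];
                tabularData[r][c] = new CellSketch(r, c, value);
                valueIndex.get(c).merge(value, 1, Integer::sum);
                columnTotals[c] += value;
            }
        }
    }
}
```

Because the index and totals are filled inside the same loop that creates the cells, the imported data can be discarded as soon as the loop ends, which matches Step 4.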



2.1.1.2 Properties

2.1.1.2.1 Private DataSourceTypeEnum primaryDataSourceType

PrimaryDataSourceType contains the Primary Data Source Type (ASCII /
ODBC / HTML).

2.1.1.2.2 Private String primaryDataSource

PrimaryDataSource is a string that contains the name of the Primary Data
Source of the ActiveTable.

2.1.1.2.3 Private GridElement tabularData[]

TabularData is a two-dimensional array of GridElements, representing the
actual table displayed.

2.1.1.2.4 Private Dimension defaultGridSize

DefaultGridSize contains the default grid size for the grid elements.

2.1.1.2.5 Private int defaultFontWeight

DefaultFontWeight contains the default font weight of the text displayed in the
grid element.

2.1.1.2.6 Private int defaultFontSize

DefaultFontSize contains the default font size for the text displayed in the grid
element.

2.1.1.2.7 Private Color defaultFontColor

DefaultFontColor contains the default font color of the text displayed in the grid
element.

2.1.1.2.8 Private Index tableIndex

TableIndex contains indexing information about the table displayed by the ActiveTable.

2.1.1.2.9 Private AttributeCluster fastPatternAnalyses[]

FastPatternAnalyses is an array of AttributeClusters that facilitate FastPattern
analysis.

2.1.1.2.10 Private ColumnInfo fastPatternTotals[]

FastPatternTotals is an array containing information about the totals for various
columns appearing in the fastPatternAnalyses array.

2.1.1.2.11 Private InfoBalloon fastPatternBalloon

FastPatternBalloon contains the FastPattern information to be displayed to the user
when she moves her mouse around on the ActiveTable.

2.1.1.2.12 Private FIFO fastPatternCache

FastPatternCache contains a cache of the previous FastPatterns viewed by the user.
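
A first-in, first-out cache of this kind could be sketched as below. This is only an illustration: the document does not describe the FIFO class, so the name FifoCacheSketch, the fixed capacity and the evict-oldest rule are all assumptions.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical bounded FIFO cache; capacity and eviction rule are assumptions. */
final class FifoCacheSketch<T> {
    private final int capacity;
    private final Deque<T> entries = new ArrayDeque<>();

    FifoCacheSketch(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Adds an item, discarding the oldest entry first if the cache is full. */
    void add(T item) {
        if (entries.size() == capacity) {
            entries.removeFirst();
        }
        entries.addLast(item);
    }

    boolean contains(T item) {
        return entries.contains(item);
    }

    /** Returns the oldest cached entry, or null if the cache is empty. */
    T oldest() {
        return entries.peekFirst();
    }

    int size() {
        return entries.size();
    }
}
```

With a capacity of two, adding three patterns in sequence leaves only the last two in the cache.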



2.1.1.3 Methods

2.1.1.3.1 Public void setPrimaryDataSourceType (DataSourceTypeEnum
newPrimaryDataSourceType)

Sets primaryDataSourceType to newPrimaryDataSourceType.

2.1.1.3.2 Public DataSourceTypeEnum getPrimaryDataSourceType ()

Returns primaryDataSourceType.

2.1.1.3.3 Public void setPrimaryDataSource (String newPrimaryDataSource)

Sets primaryDataSource to newPrimaryDataSource.

2.1.1.3.4 Public String getPrimaryDataSource ()

Returns primaryDataSource.

2.1.1.3.5 
2.1.2. The ActiveTableBeanInfo Class
2.1.3. The GridElement Class
2.1.4. The DataImport Class
2.1.5. The AsciiDataImport Class
2.1.6. The ODBCDataImport Class

2.1.7. The HTMLDataImport Class

2.1.8. The Index Class

2.1.9. The GridElement Class

2.2. FastPattern Support
2.2.1. The Analysis Class
2.2.2. The Balloon Class
2.2.3. The FIFO Class



The various elements of the ActiveTable Applet are explained below:

1. ActiveTable Applet

2. Grid Element

3. ActiveTable

4. Text Area

5. Buttons



2.3. GridElement (extends TextArea)

A GridElement is an object that contains a single element of the data to be
displayed and analyzed. It is not a JavaBean, and is not directly available for
manipulation, but serves as a building block for an ActiveTable.

2.3.1. Constructors

2.3.1.1 Public GridElement ()

Constructs a "floating" GridElement object of default size with data = 0.0.

2.3.1.2 Public GridElement (int row, int col, float data)

Constructs a GridElement object of default size anchored to the specified row
and column, with data as specified.

2.3.1.3 Public GridElement (int row, int col, Dimension size)

Constructs a GridElement object of the specified size, anchored to the specified
row and column.

2.3.1.4 Public GridElement (int row, int col, Dimension size, float data)

Constructs a GridElement object of the specified size, anchored to the specified
row and column, and containing the data specified.
2.3.2. Properties

2.3.2.3 Private Dimension size

Specifies the size of the GridElement.

2.3.2.4 Private boolean numeric

Specifies whether the GridElement contains numeric data. If this element is true,
the GridElement contains numeric data; otherwise, it contains string (textual)
data.

2.3.2.5 Private float data

Specifies the data contained in the GridElement (if isNumeric is true).

2.3.2.6 Private String displayText

Specifies the text displayed in the GridElement [...]
ActiveTable. DisplayText always contains the string representation of the data
contained in the GridElement.

2.3.2.7 Private int fontSize

Specifies the font size for the GridElement.

2.3.2.8 Private int fontWeight

Specifies the font weight for the GridElement.

2.3.2.9 Private boolean bold

Specifies whether the GridElement contents are to be displayed in bold font.

2.3.2.10 Private boolean italicized

Specifies whether the GridElement contents are to be displayed in italics.

2.3.2.11 Private Color color

Specifies the color of the GridElement contents.

2.3.3. Methods

2.3.3.1 Public int getRow ()

Returns the row (value of the row property) to which the GridElement is
anchored.

2.3.3.2 Public void setRow (int newRow)

Sets the row to which the GridElement is anchored (sets the row property to
newRow).

2.3.3.3 Public int getColumn ()

Returns the column (value of the column property) to which the GridElement is
anchored.

2.3.3.4 Public void setColumn (int newColumn)

Sets the column to which the GridElement is anchored (sets the column property
to newColumn).

2.3.3.5 Public Dimension getSize ()

Returns the size (value of the size property) of the GridElement.

2.3.3.6 Public void setSize (Dimension newSize)

Sets the size of the GridElement (sets the size property to newSize).

2.3.3.7 Public boolean isNumeric ()

Returns the value of the numeric property.

2.3.3.8 Public void setNumeric (boolean newNumeric)

Sets the numeric property of the GridElement to newNumeric.

2.3.3.9 Public float getData ()

Returns the data contained in the GridElement (value of the data property).

2.3.3.10 Public void setData (float newData)

Sets the data contained in the GridElement (sets the data property to newData).

2.3.3.11 Public String getDisplayText ()

Returns the text displayed by the GridElement (value of the displayText
property).

2.3.3.12 Public void setDisplayText (String newDisplayText)

Sets the text displayed by the GridElement (sets the displayText property to
newDisplayText).

2.3.3.13 Public int getFontSize ()

Returns the font size of the GridElement (value of the fontSize property).

2.3.3.14 Public void setFontSize (int newFontSize)

Sets the font size of the GridElement (sets the fontSize property to
newFontSize).
2.3.3.15 Public int getFontWeight ()

Returns the font weight of the GridElement (value of the fontWeight property).

2.3.3.16 Public void setFontWeight (int newFontWeight)

Sets the font weight of the GridElement (sets the fontWeight property to
newFontWeight).

2.3.3.17 Public boolean isBold ()

Returns information about whether the text in the GridElement is displayed in
bold letters (value of the bold property).

2.3.3.18 Public void setBold (boolean newBold)

Sets the bold property of the GridElement to newBold.

2.3.3.19 Public boolean isItalicized ()

Returns information about whether the text in the GridElement is italicized (value
of the italicized property).

2.3.3.20 Public void setItalicized (boolean newItalicized)

Sets the italicized property of the GridElement to newItalicized.

2.3.3.21 Public Color getColor ()

Returns the current color of the text displayed in the GridElement (value of the
color property).

2.3.3.22 Public void setColor (Color newColor)

Sets the color of the GridElement (sets the color property to newColor).

2.3.3.23 Public void displayBalloon (Balloon newBalloon)

Displays newBalloon at the GridElement's lower right corner.
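
As an illustration of how the data and displayText properties relate, the following minimal sketch keeps displayText in step with data. It is a simplification under stated assumptions: GridElementSketch is a hypothetical name, it does not extend TextArea, it covers only a few of the properties listed above, and the use of -1 to mark a "floating" (unanchored) element is an assumption.

```java
/** Hypothetical, reduced GridElement: anchoring, data and displayText only. */
final class GridElementSketch {
    private int row;
    private int column;
    private float data;
    private String displayText;

    /** "Floating" element with data = 0.0; -1 as the unanchored marker is an assumption. */
    GridElementSketch() {
        this(-1, -1, 0.0f);
    }

    GridElementSketch(int row, int column, float data) {
        this.row = row;
        this.column = column;
        setData(data);
    }

    int getRow() {
        return row;
    }

    void setRow(int newRow) {
        row = newRow;
    }

    int getColumn() {
        return column;
    }

    void setColumn(int newColumn) {
        column = newColumn;
    }

    float getData() {
        return data;
    }

    /** Assumption: updating the data also refreshes its string representation. */
    void setData(float newData) {
        data = newData;
        displayText = Float.toString(newData);
    }

    String getDisplayText() {
        return displayText;
    }
}
```

Refreshing displayText inside setData is one way to keep the displayed text equal to the string form of the data; the design itself also lists a separate setDisplayText method, which this sketch omits.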

2.3.4. Events 

2.4. ActiveTable 

2.5. Text Area 

2.6. Buttons 

An ActiveTable can have one or more analyses associated with it. These are the
[...] the nature of the pattern displayed [...] to the
implementation. They are represented using the AttributeCluster class.
Issues in Data Mining on the Internet

The following issues are considered in this document:

1. Consumer vs. Corporate user

This is more of a "thin client" [...] exist, it would be preferable
[...] "Thin Client": [...] memory constraints [...] results across to the client.

[...] the client, and mine the data on the client.



Scenario: Data Mining on the client

Pros:
1. No special server software required to process Data Mining requests
2. Web / Database Server is not loaded
3. No extra traffic on the network

Cons:
1. Client may take longer to show Data Mining results

Scenario: Data Mining on the server

Pros:
1. Server is scalable, and hence, might be faster

Cons:
1. Special server software required
2. Extra load on Web / Database server
3. Server software repeats work for same Data Mining queries from
different clients
4. Extra traffic on network, to transmit results

On a corporate intranet, however, this might be possible.

2. Runtime (user) vs. Design-time (developer) mining

Design-time Mining: Using JavaBeans [...]
the queries are sent along with the object.

Runtime Mining: Only the data mining code and data are sent to the client, and
the actual mining is performed there.
Scenario: Design-time Mining

Pros:
1. User does not have to wait for mining results
2. The (presumably higher) computational power of the server can be used

Cons:
1. Extra traffic on the network (as the Data Mining results have to
be sent along with the applet)
2. The results sent to the client may not ever be seen by the user

Scenario: Runtime Mining

Pros:
1. No extra load on the network (as the client does her own mining)
2. Client mines data only when required, so all results are always
seen by the user

Cons:
1. User may have a long wait for mining results (especially for "thin"
clients)



3. Small tables vs. large tables

The "smallness" of a table, with regard to its amenability for Data Mining, is
[...]

a. [...] limited to 3 or 4 [...]

b. The number of unique values for each attribute chosen

c. The depth of mining required - this can be limited

d. The number of records to be mined.

Mining is faster on small tables than on large tables.
The table's "smallness", more than any of the above factors, should guide the
decision of where the data mining is performed. Consider:

a. It is not always possible to determine whether a potential client is "thin" or
"fat". However, it is always possible to determine the "smallness" of a table,
at the time when the ActiveTable™ is being designed.

b. The ability of the client to mine data in as short a [...]
related to the "smallness" of the table being mined. If the table is small,
even an "ultra-thin" client may be able to mine it [...]
If the table is huge, even a big server may take minutes to mine it.



We are dependent on the Java-ActiveX Bridge to convert code from one
[...] to the other. This has to be tested thoroughly to ensure that the
[...] modifications may have to be made.
[...] mining data purely on the server, or purely on the client.
The design algorithm would be as follows:

[...] investigated). [...] results on a "repository" on the [...]

[...] do not store the results on the client applet.

5. For big tables, [...] and the user requests some mining [...]

"repository" entity, and displays the results returned.
[Figure: flowchart with the following steps, in order: Start; Take inputs from developer; Mine Data; Store results in server "repository"; Is Table "small"?; Store results in applet; Save Applet; Stop.]

Design Procedure
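
The Design Procedure can be sketched roughly as follows. This is a hedged illustration, not the actual design: DesignProcedureSketch and its members are hypothetical names, mining results are modelled as query-to-result strings, "smallness" is reduced to a record-count threshold (the document lists several other factors), and the sketch assumes that results are stored in the applet only when the table is small.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of the design-time decision about where results are kept. */
final class DesignProcedureSketch {
    // Stand-in for the server "repository" of mining results.
    final Map<String, String> serverRepository = new HashMap<>();
    // Stand-in for results saved inside the applet itself.
    final Map<String, String> appletResults = new HashMap<>();

    /** Stores mined results on the server, and in the applet for small tables. */
    void design(Map<String, String> minedResults, int recordCount, int smallTableThreshold) {
        serverRepository.putAll(minedResults);
        // Assumption: "small" means at most smallTableThreshold records.
        if (recordCount <= smallTableThreshold) {
            appletResults.putAll(minedResults);
        }
    }

    /** Runtime lookup: use the applet's copy if present, else ask the repository. */
    String lookup(String query) {
        String local = appletResults.get(query);
        return local != null ? local : serverRepository.get(query);
    }
}
```

For a table above the threshold, the applet carries no results and every lookup falls through to the repository; for a small table, the lookup is answered locally.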
2.

[...] times to answer the same queries. [...]
the results are stored, it will merely have to fetch the result, and transmit
[...]

[...] cache results on the client, even for big tables, so that
repeated [...] do not have to be made to the server. This speeds up
[...]

5.